ABSTRACT


This thesis is an investigation into the value of intelligence on 
enemy position and strength during a simulated battle experience. An 
experiment was conducted to determine if there was an amount of intelli- 
gence which could statistically be shown to be optimal, with more or less 
intelligence resulting in a degradation in performance by the decision 
maker. A variation of chess was utilized as the basic war gaming model. 
Subjects were provided different levels of intelligence on the enemy's 
strength and position. A computerized chess game calculated all enemy 
moves. All aspects of the experiment, including filtering of intelligence, 
communications between display terminals, and data collection were under 
software control. 

The analysis of the data obtained from the experiment suggests that the 
amount of intelligence provided did correlate with player performance, and 
that there exists a level of information such that additional information
leads to decreased performance.
I. INTRODUCTION


A. BACKGROUND 

A large body of research has been conducted in the recent past
concerning what information is used by a leader to make decisions on a
tactical or strategic battlefield. Studies considering the value of
intelligence to a decision maker make up a substantial proportion
of this research. It was our hope that this thesis could add to
the understanding of the importance to a force commander of military
intelligence about the enemy.

Most of the research on the value of intelligence has been loosely
structured and is subjective in nature. The tremendous scope that
surrounds the whole idea of studying the "value of intelligence" seems to
predicate a broad overview style of research rather than rigorously controlled
scientific effort.

Our goal was to look at the value of intelligence in a quantitative way. 
To accomplish this, it was recognized that the magnitude of this study must
be strictly confined in order that numerical results, rather than simply 
observations or personal impressions, could be obtained. Conclusions 
based on real experimental data would be sought, not generalized opinions 
or observations. 

There are a number of ways to gather data to study the value of intelli- 
gence to a leader. Past research has used everything from historical reports 
on actual battles or wars, to results of training exercises and operations.
The war game is a vehicle becoming more in vogue to generate useful data for
analysis of this kind. Because of our desire for a strictly controlled
environment for the thesis's investigation into the value of intelligence, 
a war game seemed to be a desirable medium to use. An experiment conducted
using a "credible" war game, we believed, would provide the definitive
conclusions necessary to numerically justify the inferences expected to be
made on the value of intelligence.

War games can be classified in a multitude of fashions, such as purpose 
of the game, scope or level of the game, type of simulation or model, 
method of evaluation, and level of abstraction. The type of war game we 
sought would be classified in the JCS Joint War Gaming Manual [Ref. 1] as 
a research type war game. Our need for the game was for use as a testing 
vehicle for research into the value of intelligence. 

Current computerized war games range from very large analytical simula- 
tions which take hours to calculate one game turn, to small educational war
games designed to provide the players with semi-realistic decision-making 
experience. These types of games are designed to train individuals, not 
provide analytical data for experimentation purposes. Most lack the cru- 
cial ability to provide computer generated decisions. War games such as 
Naval Warfare Interactive Simulation System (NWISS) and the McClintic 
Theater Model (MTM) rely on the human to make the decisions. The computer 
simply keeps track of the multitude of parameters concerning the current 
game situation, and updates those parameters, given the decisions made by 
the players. We sought a one-sided game where the computer could provide
quality, consistent decisions over a number of game turns. 

These requirements led us to the selection of a variation of
chess as the principal war game for the experiment. The reasons behind
selecting chess over more complex war games will be discussed in greater
detail in the experimental design chapter. It will suffice here to say that
chess met the requirement of being simple enough to allow us to manipulate,
automate, and analyze it, while still maintaining what we and many others
believe to be a reasonable surrogate of a battle experience.


B. PURPOSE 

The purpose of this thesis was to design, conduct, and analyze an experi- 
ment which would allow us to study the value of intelligence in a low-level 
war gaming situation. The experiment was to confront military subjects with 
a variety of different amounts of information about "friendly" (white) or 
"enemy" (black) positions on a chess board. Data would be collected from 
these confrontations (trials) and from the analysis of this data it was hoped 
that some definitive results as to the "value of intelligence" could be 
obtained. 

A number of different hypotheses are possible when studying the value of 
intelligence. Intuition told us that it was probably true that the more 
information given, the better a commander would perform, as long as the 
information provided was relevant to the situation, and the quantity of the 
information was controlled to avoid "information overflow". It was our
desire to prove or disprove this intuition, by studying information as it
applies to intelligence about the enemy. 

Therefore, our primary hypotheses for the experiment were: 

1. The amount of intelligence provided to a decision maker (in this
case a chess player) on enemy positions and strength was positively
correlated with the player's performance after using the information provided.

2. There was a specific amount of intelligence which could be shown to
be optimal, and if more or less than the optimal amount of intelligence was
provided, the performance of the player would be degraded.

There were other related secondary hypotheses or experimental issues 
which we felt naturally arose from the testing of the primary hypotheses. 
One of the most obvious issues to consider was that experienced
commanders probably perform better. In our experiment this corresponds to
the expectation that the better the chess player, the better the performance,
regardless of the amount of intelligence provided. Another hypothesis would
be that additional intelligence might be of more use to the weaker, less 
experienced decision maker. A strong chess player might not use or even want 
information that a weaker player possibly would find extremely useful. 

And finally as a postulate to these hypotheses, we hoped to show that 
one of the most basic of war games (chess) could be used to formulate defini- 
tive results which could meaningfully contribute to overall understanding 
of the research area. Although ancient in design and considered rudimentary 
in scope by some war gamers, the fundamental ideas of position, strength, 
and movement in chess could be naturally related to the same decision-making 
parameters one must take into account to make acceptable decisions on the 
modern battlefield. There are of course differences in rules, timing, scope,
and magnitude, but the underlying principles are the same.


C. APPROACH 
The approach taken consisted of four distinct phases, each of which will 
be reported on in a separate chapter of this thesis. In general, the four
actions were:

1. Formulate the hypotheses we wished to test and design an experiment
which would allow us to accurately evaluate these hypotheses. Chapter
II reports on areas such as selecting the appropriate war game, devising a
credible measure of effectiveness, and developing a suitable mathematical
model which would support the experimental aims. The criteria involved in
selecting the necessary subjects for the experiments, along with equipment
requirements, are also discussed.

2. Design the software needed to support the experiment. In Chapter
III the computer programs developed for the experiment are examined. We
did not feel it necessary to include in the thesis all of the code written
for the experiment. We felt it more appropriate that each of the
sections, along with some of the modules contained in the sections, be
discussed in a more holistic manner within the body of the report. This
should lead to a better understanding of just what was required of the 
software and how it fulfilled those requirements. 

3. Conduct the experiment. Chapter IV recounts the particulars on the 
actual execution of the experiment. The procedures involved in administer- 
ing and controlling the experiment are explained, together with more de- 
tailed information on the players and equipment used. 

4. Analyze the results and draw our conclusions. Chapter V explains 
the approach and the specific statistical tools used to reduce the data.
The conclusions we reached are presented in Chapter VI.
II. EXPERIMENTAL DESIGN


A. SUBJECTS

The desired subject for our experiment was an adult with some military 
experience that could be brought to bear in making decisions of a tactical 
nature. The particular service connection would be unimportant because the 
game chosen would not favor any particular military background. The sub- 
ject would preferably have some familiarity with the game of chess but 
would not be a highly experienced player. Also, the subject should be 
familiar with the use of computer terminals in general as a communications
device to avoid possible contamination of the results of the experiment from 
computer angst. 

The other driving factor was that a sufficiently large group of subjects 
was required to obtain enough data points for analysis. Exactly how many 
were enough could not be determined at the outset because the sample size 
would, of course, depend on the other design factors and the amount of 
data scatter actually observed. After completing the initial design of the 
experiment, pilot trials were conducted to get some idea of the variation 
in scores we could expect and to refine some of the experiment parameters. 
Based on those trial runs we felt that "the more the better" was the
answer. We expected that the number of volunteers we could enlist or conscript
would be smaller than the number physically possible to process in the time
we were allotted for priority use of the WAR Lab facilities, and that we
would probably observe significant variations in the scores.








As things turned out, we had thirty-one subjects available meeting our 
criteria, and the processing of their trials and data took essentially the 
entire time available. Time played a part in other design parameters, 
also, as will be discussed later. For more specific information on the
subjects actually involved, see Chapter IV, Conduct of the Experiment.


B. APPARATUS 

The experiment was run in the Wargaming Analysis and Research Laboratory 
(WAR Lab) at the Naval Postgraduate School. A computer capable of handling 
the required interfacing, multi-terminal coordination, visual display, and 
data collection was essential. An atmosphere in which terminals and work 
space could be reserved for experimental use and in which the subjects and
umpires could be relatively undisturbed was equally important. The WAR Lab 
offered those advantages. 

All artificial intelligence in the process of deciding Black's moves 
was provided by the "Super-Nine Chess Challenger" (a commercial chess game 
manufactured by Fidelity Electronics, Ltd.). 

The VAX-11/780 mini-computer in the WAR Lab was also used with a separate 
chess game program to allow the subjects to play practice games for nomencla- 
ture and chess familiarization. 

More detail about the specific equipment used can be found in Chapter
IV, Conduct of the Experiment.


C. PROCEDURES 
The first step was the selection of the hypothesis and a suitable 
vehicle with which to test it. We believed that the amount of information
a decision maker has and the results of his decision are correlated. We
thought this to be particularly true in the field of military intelligence
and combat. We expected that there would be an optimum level of
intelligence; too little intelligence resulting in too much uncertainty and too
much resulting in an overload of the information assimilation process which
could disguise key issues in the flood of minutiae. To test our hypothesis
we needed a test bed of some sort. To improve credibility of the results 
an actual combat situation would have been the best test bed but that is 
obviously not practical. It would also not allow for duplication and would 
be virtually impossible to control rigidly. For the same reasons a field 
exercise, probably the next most credible format, was not practical, 
either. War games are generally accepted as the next echelon of credibility 
for military situations and can be run economically, repeatedly, and with 
varying degrees of control. Ideally, the war game selected would have some 
easily arrived at measure of effectiveness (MOE), be of relatively short 
duration, require only one subject at a time, not be so closely allied to 
one area of combat as to give significant advantage or disadvantage to 
subjects with any specific military experience, and yet would still be a
suitable surrogate for combat to be of value for demonstration. A critical
consideration in running an experiment is the ability to control and account
for the factors that may influence its outcome. Unfortunately, realism of
the battlefield environment and rigid experimental control are diametrically
opposed conditions. In order to keep our experiment as simple as possible 
and to be able to extract statistical data for hard analysis we opted for 
tight experimental control and sacrificed battlefield realism. 

A variation of chess was chosen because it most nearly met all the
criteria. Chess originated as a war game and is the oldest surviving one
in the western world. In the nineteenth century the German Army's General
Staff used a variation of chess called "Kriegspiel" ("war game") as a
training aid in tactical and strategic thinking. In "Kriegspiel," two opponents
play chess but each sees only his own pieces on his board; an umpire
provides the necessary interface between the two players. The object is still
to destroy the enemy by capturing the enemy's king, but the process is much
more difficult. Knowledge of the opponent's strength and position must be
derived from scouting, losses, engagements, etc.

The game that we used in our experiment is another variation of chess 
in which the amount of intelligence provided can be controlled by software. 
In a normal chess game two opposing players match wits developing strategy 
and counter-strategy until one is beaten. As with any other human endeavor, 
the skill with which one plays varies from game to game. This is true even 
for the greatest chess masters; certainly it would be true had we used one 
of our umpires to always play the opposition. Having our subjects play 
against each other was an even less viable solution. That would have re- 
quired a larger number of trials to get sufficient data, would have required 
much more of each subject's time, and would have provided very inconsistent 
opposition. We felt it was important to provide a consistent opposition 
for our subjects in order to remove the possible confounding of two sub- 
jects' relative chess acumen and to avoid the necessity of a very large 
number of trials. Therefore, all subjects played against the same
computerized chess game at the same level of play. We chose to set the Chess
Challenger at its lowest (easiest) level for a number of reasons. First,
how well the Chess Challenger did was not important as long as neither
white nor black frequently decimated the other. We expected that most of
our subjects would be novice chess players so that even in a normal game
the artificial intelligence in the Chess Challenger would be a significant 
challenge. How well the Chess Challenger scored was far less important to 
the experiment than the fact that its play was always at the same level. 
Therefore, while a higher level might have been more of a challenge to an 
advanced player, it was more important to use a level that would be easy 
enough to give our novices a chance at avoiding early checkmate. Pilot 
trials indicated that the lowest level of play on the Chess Challenger 
would provide an adequate challenge. Our decision was proven correct in
that even against the lowest level setting none of our subjects were
ahead at the evaluation point. Lastly, to minimize the time
required for each move we wanted the reaction of the Chess Challenger to be
as quick as possible. Using its lowest level meant that its decision tree 
analysis was kept simple with a resultant decision time of approximately 
five seconds. The computerized opponent always had perfect information 
(the normal view of the chess board with all active pieces). 

The second step was selection of a suitable MOE. This is often a diffi- 
cult problem. Consider the example of trying to evaluate the effectiveness 
of a new anti-aircraft system protecting a critical target. Possible MOE's 
are the number of bombers shot down per 1000 rounds fired or the amount of 
damage suffered from bombing before and after installation of the new system. 
Which is a better MOE depends on what one really means by the "effectiveness"
of the new system. Correct choice of an MOE requires careful scrutiny of 
the underlying questions one is trying to answer and accurate translation 
of the requirement into measurable quantities. At this stage we made a
general decision to use a point value system based on material strength,
board positions, and mobility. Points would be determined for the initial
board and then again at some number of moves later; the algebraic difference
in the scores, corrected for any penalties, became the MOE. The
specific method of doing this and assigning penalties was worked out after 
the other details of the experiment were determined and is described in 
Chapter V. 

The next major step was to develop a mathematical model of the experi- 
ment that would account for as many parameters and influences as possible. 
To allow analysis for differences in chess playing expertise each subject 
was asked at the beginning of each trial run to indicate into which of 
four categories the subject fit: novice, some experience, frequent player,
or tournament player. To remove foreknowledge of the board setup we
started each trial in mid-game. To minimize the possible "learning" of
initial piece dispositions, four separate mid-game start points were
constructed [Figures II-1, II-2, II-3, and II-4]. We felt this was important
because otherwise a player with any experience at all would start out with
"perfect information" about the opposition regardless of what information
was displayed. In such a case a player's experience as a chess player 
would take on even more significance due to the expert's ability to 
extrapolate probable black counters to his moves, knowledge of opening 
game strategy, etc. Using a mid-game start point also offered the advan- 
tages of avoiding end-game strategy (in most cases), of clearing the board 
somewhat to facilitate greater movement, and of eliminating most of the
uninteresting early swapping of pawns. These positions came from playing 
the first eight to ten moves of tournament games discussed in the chess
column of the local newspaper and stopping at a point of approximately even
strength. The sequence in which the subjects faced the initial setups was
varied systematically so as to appear random to the subject but to yield
approximately uniformly distributed sequences of play. For example, the
number of subjects playing with intelligence level one against board setup
one on their first game was approximately the same as the number facing
intelligence level three and board setup two on their first game, etc. The
order in which different information levels were used was similarly varied.¹
By collecting "trial number" as one of the experimental parameters, it was 
possible to analyze the effect, if any, of learning. By using the four 
setups to eliminate one confounding factor, we introduced another, the 
effect of playing one setup versus any other one. Therefore, the initial 
game setup was always recorded for analysis of its effect. The final item 
to go into our experimental design and the major factor of interest was
the intelligence level provided to a subject during a game. We devised six
different levels of information (which are explained in detail in Chapter 
IV) but found that it was not feasible to use them all. The six levels 
came about by determining what specific types of information could be pro- 
vided and how those types could be combined. Practical limitations on the 
time allotted to run the experiment and use the facilities, on the amount 
of game time each subject was willing to provide, and on the number of
subjects available forced the reduction to some smaller number. On the
basis of trial runs conducted with ourselves as subjects we felt four
levels were the maximum that could be used. That was a compromise. The
four selected were chosen on the basis of those trial runs so that we had
a good spread of game types (intelligence levels); choosing the three lowest
levels plus one more, for example, would have probably produced only a
small or nonexistent spread of scores because all three allowed very
little information to the subject. Intuition and our trial runs showed the
four levels selected held the best promise for delivering meaningful data
points.

¹Table V-1, the data file from the experiment, shows the sequence of
play. One of the umpire's actions before the game was to use a copy of
the uncompleted data file as the guide for selecting the appropriate
intelligence level and board setup for each subject on each trial. There
is no significance to the assignment of subject identification number.
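
The text does not give the exact scheme used to vary the order of board setups and intelligence levels. As an illustration only, one common way to obtain such a balanced ordering is a cyclic Latin square. The Python sketch below assumes four setups and four levels; the function names `cyclic_latin_square` and `schedule` are hypothetical, and this is not presented as the authors' procedure.

```python
def cyclic_latin_square(n):
    """Return an n x n Latin square; row r is one ordering of conditions 1..n."""
    return [[(r + c) % n + 1 for c in range(n)] for r in range(n)]


def schedule(subject_index, n=4):
    """Return [(board_setup, intel_level), ...] for one subject's n trials.

    Assumption: the setup order cycles with the subject index and the level
    order cycles once every n subjects, so that over n*n subjects every
    (setup, level) pair occurs exactly once at each trial position.
    """
    square = cyclic_latin_square(n)
    setups = square[subject_index % n]
    levels = square[(subject_index // n) % n]
    return list(zip(setups, levels))


if __name__ == "__main__":
    for subject in range(4):
        print(subject, schedule(subject))
```

With this layout each subject meets every setup and every level once, and the first-game combinations are spread evenly across each block of sixteen subjects.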

Combining the factors discussed above, we can represent our experi- 
mental design model as: 


$$Y = \alpha + \beta_i + \gamma_j + \delta_k + \theta_l + \varepsilon$$

where:

$Y$ = MOE
$\alpha$ = some base level
$\beta_i$ = information levels
$\gamma_j$ = trial number
$\delta_k$ = initial board setup
$\theta_l$ = subject's experience level
$\varepsilon$ = unknown or uncontrolled error.

Other factors were considered but not represented in the model. As a 
surrogate for the operational pace of combat decision making, the subject 
was allowed two minutes per move without penalty. Two minutes was arbi- 
trarily chosen after pilot trial experience showed it to be adequate. 


Practical constraints were also a factor. Allowing two minutes per move 







provided adequate time for each subject to play four games during the three 
hours each was available. The subjects were told that there was a penalty 

for exceeding two minutes per move but were not told the exact nature of the 
penalty. We did that to force the pace of the game without introducing the 
question of intentionally trading a known penalty for additional decision 
time. Any penalty was assessed after play stopped at the rate of one pawn's 
material value (256 points) for each one minute or fraction thereof of cumula- 
tive time over two minutes per move for the ten moves of each trial. However, 
to allow the player to study the initial conditions, the first move was not 
penalized. The number of moves per trial until evaluation was set at ten 
based on pilot trials to determine a suitable number. Too few moves would
not allow time for the various factors to affect the score; too many moves
would cause a large number of subjects to be checkmated resulting in a skew- 
ing of the scores. We felt that the alphanumeric board and move representa- 
tion on the computer terminal would help mitigate the expected differences 

in the subjects' chess expertise. A similar chess game using the same 
representation and nomenclature was made available to all the subjects for 
practice. To prevent the introduction of another element into the experiment, 


outside aids such as a conventional chess board were not allowed. 
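
The penalty rule and the MOE described in this chapter can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the scoring code used in the experiment: it assumes that overtime is accumulated per move (only the seconds beyond two minutes on each move count), that the first move is exempt, and that the MOE is the final score minus the initial score, less the penalty. The names `time_penalty` and `moe` are hypothetical.

```python
import math

PAWN_VALUE = 256     # one pawn's material value in points (from the text)
FREE_SECONDS = 120   # two minutes per move without penalty (from the text)


def time_penalty(move_seconds):
    """Penalty points for one trial, given the seconds spent on each move.

    Assumption: the excess over two minutes is summed over moves 2..n
    (the first move is not penalized), then charged at one pawn per
    minute or fraction of a minute.
    """
    overtime = sum(max(0, t - FREE_SECONDS) for t in move_seconds[1:])
    return math.ceil(overtime / 60) * PAWN_VALUE


def moe(initial_score, final_score, move_seconds):
    """Score change over the trial, corrected for the time penalty.

    Assumption: the "algebraic difference" is final minus initial.
    """
    return (final_score - initial_score) - time_penalty(move_seconds)


if __name__ == "__main__":
    times = [300, 130] + [60] * 8   # long first move, 10 s over on move 2
    print(time_penalty(times), moe(1000, 900, times))
```

Under these assumptions a single move that runs ten seconds over costs a full pawn, which matches the "or fraction thereof" wording.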







III. SOFTWARE DESIGN 


The design and development of the software needed to support the 
experiment took place over a 2 1/2 month period from July to September 
1983. The final software product consists of 25 subroutines, presently 
located in 3 files on the VAX 11/780 computer in the WAR Lab. The files 
are open for public review in the .THESIS subdirectory of the CHESS direc- 
tory, under filenames UMPIRE.FOR, PLAYER.FOR, and COMMON.FOR. 

All programming was done in standard VAX-11 FORTRAN 77 [Ref. 2], with 
the exception of the inter-process communications via a systems mailbox. 
That code was written in FORTRAN formatted VAX 11/780 Systems Programming 
Calls. [Ref. 3] 

The code is of course compatible with any Digital Equipment Corporation 
computer system running under a VAX/VMS operating system. With the possible 
exception of the inter-process communications mentioned above, and some 
specific intrinsic function calls not supported in standard American
National Standard FORTRAN-77, the program should compile and run on any
computer system which has a FORTRAN-77 compiler.


A. COMPUTERIZED CHESS

To design a program which will intelligently play a game of chess is a 
tremendous undertaking. It was decided early in the formulation of the 
experiment that to actually write a program which could play chess with 
even a small amount of skill was not only far beyond the scope of this 
experiment but also unnecessary, since extremely competent chess programs
exist and could be utilized easily.

A great deal of software was still needed to control the play of the
game. This software was to be designed in accordance with established 
computerized chess principles. Although no actual artificial intelligence 
would be programmed to determine moves, all other portions of the game 
required software support. In addition, a major programming effort was
required to construct the additional masks and screens necessary to
control the amount of information given to the subject during the actual
experiment. Since it had been decided that there would be little or no
direct personal interaction between the subject and the umpire, all the
communication that was to take place between the umpire and player,
along with the amount of intelligence provided the player, required
FORTRAN coding.

The basic ideas involved with the representation of chess in a computer 
are really quite simple. A chess board consists of 64 squares, organized 
in a square 8 X 8 matrix with a single item (the playing piece) possibly
sitting on top of each of the squares. These pieces move around on top
of the board according to specific rules which govern each type of piece. 
There is an object to the game, i.e. capture the opponent's king, and many 
general rules, such as pieces can not move off the board, which control 
overall play. 

Currently there are two generally accepted methods used by computerized 
chess designers to represent a chess game inside a computer. One is the 
Shannon method, which will be described in much greater detail below. This 
method uses numerical arrays to represent the board, and utilizes a large 
set of procedures which emulate the general and specific rules of the game.
The other method of representation is known as the "bit board" representation
and utilizes a series of 64-bit words to portray the basic board, and all
of the rules associated with moves from any position, using any piece, on
the board. [Ref. 4] These "bit boards" enable the processor to do simple
boolean logic operations such as 'AND' and 'OR' on combinations of the 64-bit
words to generate rules. Since fetches from memory and logical
operations are much faster than long procedures, this method reduces processor
time, which is a most critical commodity if the chess game is to formulate
computer generated moves.
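
As a brief illustration of the bit board idea, which was not the method used in this experiment, the Python sketch below treats an integer as a 64-bit word with one bit per square. The bit numbering (bit 0 = a1, bit 63 = h8) and the names `knight_attacks` and `WHITE_START` are assumptions made for the example. The squares a knight may move to are found with shifts, 'AND', and 'OR' alone.

```python
MASK64 = (1 << 64) - 1
FILE_A = 0x0101010101010101          # one bit set on every rank of file a
FILE_B, FILE_G, FILE_H = FILE_A << 1, FILE_A << 6, FILE_A << 7
WHITE_START = 0xFFFF                 # white occupies ranks 1 and 2 at the start


def knight_attacks(sq):
    """Bit board of squares attacked by a knight on square sq (0 = a1)."""
    b = 1 << sq
    not_a, not_h = ~FILE_A, ~FILE_H
    not_ab, not_gh = ~(FILE_A | FILE_B), ~(FILE_G | FILE_H)
    # Each shift is one knight jump; the file masks discard jumps that
    # would wrap around the left or right edge of the board.
    attacks = (
        ((b << 17) & not_a) | ((b << 15) & not_h)
        | ((b << 10) & not_ab) | ((b << 6) & not_gh)
        | ((b >> 17) & not_h) | ((b >> 15) & not_a)
        | ((b >> 10) & not_gh) | ((b >> 6) & not_ab)
    )
    return attacks & MASK64


if __name__ == "__main__":
    # Knight on b1 (bit 1): remove squares already held by white pieces.
    legal = knight_attacks(1) & ~WHITE_START
    print([i for i in range(64) if legal >> i & 1])   # bit indices of targets
```

The final 'AND' with the complement of the white occupancy word plays the same role as the "not occupied by a friendly piece" test in an array-based program.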

For this experiment though, speed of computing was not a factor because 
the software did not generate chess moves. The Shannon method of game 
board representation was therefore chosen for the foundation of the chess 
program's design because of its overall simplicity and also the ease with 
which it can be programmed. Shannon suggested [Ref. 5] that each square on 
the chess board be looked upon as a "mailbox" in which certain attributes, for
instance whether the square has a piece on it or not, are stored. His
original idea was to have sixty-four such "mailboxes" for the sixty-four
squares on the board. More recent programs modified this representation to
include hypothetical squares which are off the board. [Ref. 6] Our internal
board representation took this updated approach and consisted of a one
hundred twenty element array which could be thought of as a 10 X 12
board. [Figure III-1] Each mailbox could contain either zero, a positive
or negative number between one and six, or the number ninety-nine. These
numbers would tell the status of each square at any particular game time.
For instance: a 0 meant the square was empty, a +1 meant there was currently
a white pawn at that square, a -4 meant the square was occupied by a black
rook, and the number 99 depicted a square which was off the playing
board. [Table III-1] Using this system of off-the-board squares, the
edges of the board could be easily detected. Although the necessity of
these squares is not initially obvious, the use of these off-the-board
squares should become clear when an example of a move is explained in the
LMC subsection.
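
The 120-element mailbox array can be sketched in Python as follows (the original programs were written in FORTRAN). The square numbering is an assumption chosen to agree with the square numbers quoted in this chapter (QN1 = 23, KB2 = 37): squares are numbered 1 to 120 in rows of ten, with two off-board rows below rank 1 and above rank 8. Only the codes for pawn (1), rook (4), empty (0), and off-board (99) come from the text; the other piece codes and the names `square` and `initial_board` are hypothetical.

```python
OFF, EMPTY = 99, 0
# Only PAWN = 1 and ROOK = 4 are given in the text; the rest are assumed.
PAWN, KNIGHT, BISHOP, ROOK, QUEEN, KING = 1, 2, 3, 4, 5, 6
FILES = "ABCDEFGH"


def square(file_letter, rank):
    """Internal square number for a file letter A..H and a rank 1..8."""
    return 10 * (rank + 1) + FILES.index(file_letter.upper()) + 2


def initial_board():
    """Return the default starting board as a list indexed 1..120."""
    board = [OFF] * 121              # index 0 is unused padding
    back_rank = [ROOK, KNIGHT, BISHOP, QUEEN, KING, BISHOP, KNIGHT, ROOK]
    for col, letter in enumerate(FILES):
        board[square(letter, 1)] = back_rank[col]     # white pieces: positive
        board[square(letter, 2)] = PAWN
        for rank in range(3, 7):
            board[square(letter, rank)] = EMPTY
        board[square(letter, 7)] = -PAWN              # black pieces: negative
        board[square(letter, 8)] = -back_rank[col]
    return board


if __name__ == "__main__":
    b = initial_board()
    for row_start in range(111, 0, -10):              # print top row first
        print(" ".join(f"{v:3d}" for v in b[row_start:row_start + 10]))
```

Because every square outside the 8 X 8 playing area holds 99, a move generator never needs a separate test for the edge of the board.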


B. OVERALL STRUCTURE 

Ignoring the differences between the umpire and player programs for the 
time being, the overall structure of the software consisted of seven dif- 
ferent sections, each of which was designed to call various subroutines at 
different times during program execution. These sections are described 
below. 

1. Introduction and Initialization 

This portion of the code did the start-up and initialization chores,
queried the user for primary experimental data entries, initialized the
default playing board (or, if the user desired, set up a board to the 
player's specifications), and set up the timer used to time the length of 
each move. 

c. Parser 

The parser section converted the user's typed-in move to the
internal representation of the move. Depending on whether the umpire or
player entered the move, the move would be entered in either the basic
chess movement scheme (e.g. P/KB2 -KB3) or in the Chess Challenger's
board portrayal (e.g. P/F2 - F3). This move would then be converted into
the internal square and piece designations. Using the above move as an
example, P/KB2 -KB3 would be translated, if it was white's move at the
time, into: pawn (+1 for white) at internal square 37 is to be moved to
square 47. The parser checked for illegal entries, and if an illegal entry
was made, the parser informed the user and continued to ask for moves
until a legal entry (not necessarily a legal move) was made.
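
As an illustration of the conversion just described, the following Python
sketch parses both entry formats into internal square numbers. It is not the
thesis parser. The function names are hypothetical, and the index formula
assumes the 1-based layout implied by the worked example (KB2 = 37,
KB3 = 47).

```python
import re

# Descriptive-notation file names in board order, queen's rook to king's rook.
FILES = ["QR", "QN", "QB", "Q", "K", "KB", "KN", "KR"]


def descriptive_to_index(name, white_to_move=True):
    """Convert a square such as 'KB2' to its internal index."""
    m = re.fullmatch(r"(QR|QN|QB|KR|KN|KB|Q|K)([1-8])", name.strip().upper())
    if m is None:
        raise ValueError(f"illegal entry: {name!r}")
    file = FILES.index(m.group(1))
    rank = int(m.group(2))
    if not white_to_move:          # Black counts ranks from the other side
        rank = 9 - rank
    return 10 * (rank + 1) + file + 2


def challenger_to_index(name):
    """Convert a Chess Challenger style square such as 'F2' to its internal index."""
    m = re.fullmatch(r"([A-H])([1-8])", name.strip().upper())
    if m is None:
        raise ValueError(f"illegal entry: {name!r}")
    file = ord(m.group(1)) - ord("A")
    return 10 * (int(m.group(2)) + 1) + file + 2


def parse_move(text, white_to_move=True):
    """Parse 'P/KB2 -KB3' or 'P/F2 - F3' into (piece letter, origin, destination)."""
    m = re.fullmatch(r"\s*([PNBRQK])\s*/\s*(\w+)\s*-\s*(\w+)\s*", text.upper())
    if m is None:
        raise ValueError(f"illegal entry: {text!r}")
    piece, src, dst = m.groups()
    if re.fullmatch(r"[A-H][1-8]", src):
        return piece, challenger_to_index(src), challenger_to_index(dst)
    return (piece,
            descriptive_to_index(src, white_to_move),
            descriptive_to_index(dst, white_to_move))
```

A rejected entry raises `ValueError`, which a calling loop could catch in
order to prompt again, mirroring the behaviour described above of asking
until a legal entry is made.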

3. Legal Move Checker (LMC) 

This portion of the program was one of the most complex and required 
a great deal of design effort and debugging time to get working properly. 
The idea behind the LMC was to determine whether an attempted move was 
legal, given that the LMC knew the origin and destination square, and the 
type of piece which the user wanted to move. Legal moves can then be deter- 
mined by noting the mathematical relationships between squares. 

For example, if a white knight was to be moved from its default
starting position at QN1 to QR3, the possible legal moves could be
calculated by adding the following offsets to the origin square.

Origin Square = 23: +8, +12, +19, +21, -8, -12, -19, & -21

Each of the resulting squares was then matched against the destination
square. If a match occurred, and the square was not occupied by a friendly
piece or located off the board, the move was legal. Figure III-2 shows how
the default board would be internally represented at the start of a game.
Using Figure III-1 as a guide, the above example shows that adding +19 and
+21 gives the only two legal moves from square 23. These two squares (42
and 44) are the only two which have a 0 or negative number in them. All
other offsets lead to squares containing either a positive number (it is
illegal to move on top of one of your own pieces) or a 99 (which means the
attempted move is off the board). Since the desired move is QR3 (square
42), this is a legal move.

This is a highly simplified example of how the LMC works. The
moves for sliding pieces such as rooks or bishops are much more complicated
to check, but follow somewhat the same principle using offsets and
comparisons. The LMC looks only at regular moves and capture moves. A
more streamlined approach could have been taken to eliminate some of the
redundant offset additions; however, some of the same code was to be used
in other sections of the program. Therefore some efficiency was sacrificed
for clarity and generality. A more detailed explanation of this section
can be found in the comments of the program's source code, or in an
informative book on computerized chess which was used extensively in
modeling the movement portion of the chess program. [Ref. 7]
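
The offset technique can be sketched as follows. This is a minimal Python
illustration under stated assumptions, not the LMC itself: the piece codes
other than pawn (1) and rook (4) are assumed, the function names are
hypothetical, and, like the simplified example above, the sketch ignores
check, castling, and pawn moves.

```python
OFF_BOARD = 99
KNIGHT_OFFSETS = (8, 12, 19, 21, -8, -12, -19, -21)
ROOK_DIRECTIONS = (1, -1, 10, -10)
BISHOP_DIRECTIONS = (9, 11, -9, -11)


def starting_board():
    """Default set-up on a 1-based 120-mailbox board (element 0 is padding)."""
    board = [OFF_BOARD] * 121
    back = [4, 2, 3, 5, 6, 3, 2, 4]     # assumed codes: R N B Q K B N R
    for f in range(8):
        col = f + 2
        board[20 + col], board[30 + col] = back[f], 1
        for row in (40, 50, 60, 70):
            board[row + col] = 0
        board[80 + col], board[90 + col] = -1, -back[f]
    return board


def can_land(board, dest, side):
    """True if dest is on the board and not held by a friendly piece (side is +1 or -1)."""
    content = board[dest]
    return content != OFF_BOARD and content * side <= 0


def knight_move_is_legal(board, origin, dest, side):
    """Match the destination against origin plus each knight offset."""
    return dest in (origin + off for off in KNIGHT_OFFSETS) and can_land(board, dest, side)


def sliding_move_is_legal(board, origin, dest, side, directions):
    """Walk each direction one mailbox at a time until something blocks the way."""
    for step in directions:
        sq = origin + step
        while board[sq] == 0:            # empty squares can be passed through
            if sq == dest:
                return True
            sq += step
        # sq now holds a piece or a 99; only an enemy piece may be captured
        if sq == dest and can_land(board, sq, side):
            return True
    return False
```

For the knight on square 23, only offsets +19 and +21 survive `can_land`,
matching the example in the text. The sliding check shows why the border
mailboxes matter: the walk always stops at a 99 without any explicit edge
test.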

4. Display 

The display section was responsible for all output to either the 
terminal, the line printer, or separate files. It consisted of subroutines 
or modules which performed the following functions: 

a. Display the board after each legal or illegal move. The
internal representation of the board had to be converted into a
representation suitable for display on the output device. The board display
type chosen consisted of eight lines of two-symbol groups which were either
dashes or asterisks for white or black squares which were empty, or two
letters to depict a piece that occupied a square. [Figure III-3] A black
king, for example, would be BK and a white knight would be displayed as WN.

b. Decide how much information should be removed from or added to the
normal board display. This code, coupled with the intelligence
determination modules of section 5 below, ensured that the proper amount of
intelligence was displayed to the subject, given the game scenario being
run at the time.

c. Display a safe board to the subject. During different
scenarios this board would be displayed to the subject using eight lines
of three-letter groups which informed the subject which squares on the
board were safe from attack. [Figure III-4] The letter groups were in
standard chess square terminology (i.e. KB7 = the square king bishop 7) and
would be displayed only if that square was safe from attack.

d. Output to a separate file each move and the time it took to
make the move. The program was designed to output each move to ensure
that if any data was lost, the game could easily be reconstructed. The
standard chess move format was chosen for output; therefore, the umpire's
moves required translation from Chess Challenger format before they could
be written to the file.

e. Collect and save data points. Specifically at turn ten of 
each game, and additionally at any time the umpire chose, the program would 
query the umpire for an evaluation of the board situation at that time. 
This information, along with the actual board position, and all of the 
other experimental independent variables, would then be saved in the data 
file for future analysis. 

f. Display to the other player the move that was entered. Since 
the umpire and subject played the game using different game board represen- 
tations, each move required translation into the other player's format 
before it could be displayed on the opponent's terminal. 
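
The board display of item a can be illustrated with a short Python sketch.
It is not the thesis display routine. It assumes that dashes stand for empty
white squares and asterisks for empty black squares, that rank 8 is printed
first, and that pieces other than the king (K) and knight (N) use the letters
P, B, R, and Q; only BK and WN are given in the text.

```python
OFF_BOARD = 99
# Assumed piece codes and letters; the text confirms only pawn = 1, rook = 4, K and N.
PIECE_LETTERS = {1: "P", 2: "N", 3: "B", 4: "R", 5: "Q", 6: "K"}


def empty_board():
    """1-based 120-mailbox board with all 64 real squares empty."""
    board = [OFF_BOARD] * 121
    for rank in range(1, 9):
        for file in range(8):
            board[10 * (rank + 1) + file + 2] = 0
    return board


def render(board):
    """Return eight text lines of two-symbol groups, rank 8 first."""
    lines = []
    for rank in range(8, 0, -1):
        groups = []
        for file in range(8):
            code = board[10 * (rank + 1) + file + 2]
            if code == 0:
                dark = (file + rank) % 2 == 1      # QR1 is a dark square
                groups.append("**" if dark else "--")
            else:
                colour = "W" if code > 0 else "B"
                groups.append(colour + PIECE_LETTERS[abs(code)])
        lines.append(" ".join(groups))
    return lines
```

Calling `print("\n".join(render(board)))` would produce the eight-line
display for whatever position `board` holds.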

5. Intelligence 

This section modelled the heart of the actual experiment since its
functions were to derive the information on attackable pieces, pieces that
were safe from attack, and squares that would be safe if a piece was to
move there. The primary subroutines of this section were WHITE & BLACK-ATTACK
and SAFE-BOARD. These subroutines in turn called many of the same
routines used in the legal move checker, though normally with different
input parameters and common variables. The basic idea of any of these
routines was to check every square on the board for possible legal moves
from that square, depending on whether pieces that could be attacked, or
safe squares, were sought at the time. These possible legal moves would
then be matched against the playing pieces relevant to the intelligence
needed, and a board would be constructed which would simply contain yes or
no to the question of whether the square was, for example, safe from attack.
These "boards" were just arrays of boolean variables which could then
easily be matched one for one with the actual game board to display the
proper information for the scenario being played at that time.
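
The construction of such a boolean board can be sketched as below. This is
an assumption-laden Python illustration rather than the SAFE-BOARD routine:
it treats a square as safe when no enemy piece attacks it in the current
position, it does not re-examine the position after a hypothetical move, and
the piece codes other than pawn (1) and rook (4) are assumed.

```python
OFF_BOARD = 99
KNIGHT_STEPS = (8, 12, 19, 21, -8, -12, -19, -21)
KING_STEPS = (1, -1, 10, -10, 9, 11, -9, -11)
ROOK_DIRS = (1, -1, 10, -10)
BISHOP_DIRS = (9, 11, -9, -11)
# Assumed piece codes (only pawn = 1 and rook = 4 appear in the text).
PAWN, KNIGHT, BISHOP, ROOK, QUEEN, KING = 1, 2, 3, 4, 5, 6


def empty_board():
    """1-based 120-mailbox board with all 64 real squares empty."""
    board = [OFF_BOARD] * 121
    for rank in range(1, 9):
        for file in range(8):
            board[10 * (rank + 1) + file + 2] = 0
    return board


def attacked_by(board, side):
    """Boolean array: True where a piece of `side` (+1 white, -1 black) attacks."""
    attacked = [False] * 121
    for origin, code in enumerate(board):
        if code == OFF_BOARD or code * side <= 0:
            continue                                  # border, empty, or enemy piece
        kind = abs(code)
        if kind == PAWN:
            single, rays = (9 * side, 11 * side), ()  # pawns capture diagonally forward
        elif kind == KNIGHT:
            single, rays = KNIGHT_STEPS, ()
        elif kind == KING:
            single, rays = KING_STEPS, ()
        elif kind == BISHOP:
            single, rays = (), BISHOP_DIRS
        elif kind == ROOK:
            single, rays = (), ROOK_DIRS
        else:                                         # queen
            single, rays = (), ROOK_DIRS + BISHOP_DIRS
        for step in single:
            if board[origin + step] != OFF_BOARD:
                attacked[origin + step] = True
        for step in rays:
            square = origin + step
            while board[square] != OFF_BOARD:
                attacked[square] = True
                if board[square] != 0:                # a piece blocks the ray
                    break
                square += step
    return attacked


def safe_board(board, side):
    """Boolean array: True for real squares that the opponent of `side` does not attack."""
    enemy = attacked_by(board, -side)
    return [board[i] != OFF_BOARD and not enemy[i] for i in range(121)]
```

The result has the same indexing as the game board, so it can be matched
one for one against it when deciding which squares to print.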

6. Castling

Castling, because of the many rules involved in this maneuver, was 
handled separately from the rest of the movement sections. Before a castle 
could be made, numerous rules had to be checked that were different from a 
normal move's rules. Also the move required the relocation of two pieces 
rather than the usual one. The parser would identify a request for a 
castle and then the following rules had to be checked before the move could 
be made. 
a. Are the king and rook in their proper positions?
b. Has the king or the rook you wish to move been moved before?
c. Are there any pieces between the king and rook?
d. Is the king in check?
e. Will the king move into or through check during the castle?

If all of these questions were resolved satisfactorily, the castle would
take place as requested. Otherwise, an illegal move message would be
displayed on the terminal and the player would be asked for a different
move.

7. Exchanging of Pawns

This section dealt with the situation which occurs when a pawn
reaches row eight of the game board. If this situation occurred during
the movement of a pawn, the code would query the user for the type of
piece to be exchanged for the pawn, and then make the substitution as
required.


C. MAIN PROGRAM 

The above sections were integrated with a main program (either player 
or umpire) to form the executable program module. The subroutine integra- 
tion was done primarily at linkage edit time since some modules were used 
in both the player and umpire processes. In addition to the seven major
sections, there were a few other minor segments, such as a routine which
determined whether a piece was in check, and a portion of code which would
check to see if the player was still in check after a move was made.

The main program was a large repetitive loop. The flow of control would
normally go through sections 2, 3, 4, and 5 on each move [Figure III-4].
Section 1 would be executed only upon program start-up or reset. Sections
6 and 7 were executed only on demand.


D. DIFFERENCES BETWEEN UMPIRE AND PLAYER PROGRAMS 

The overall structure of the umpire and player programs was generally
the same. Each program was designed to run as a separate process. All
subroutines which were common to both the umpire and player programs were
compiled as separate routines and linked into each program separately.
There were some basic differences between the two processes, though, and
they required individual program code.

1. All data collection and output was handled completely by the
umpire terminal. The player had no control over the data that was saved.

2. The umpire would always see the highest level of intelligence at 
his/her terminal. The player would see only what was selected for the 
player to see by the umpire. 

3. All timing was conducted only in the umpire program. 

4. The umpire process handled all the translations between move
formats. These translations included: Chess Challenger to regular chess,
regular chess to Chess Challenger, and perspective changes such as a move
in black's perspective translated to the same move in white's
perspective. A black to white translation meant, for instance, that if
black was going to move a piece from his KR3 to KR4, white would be told
the move from his perspective, i.e. KR6 to KR5.
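
These translations reduce to two small rules, sketched here in Python. The
function names are hypothetical and the code is an illustration rather than
the umpire program: a perspective change replaces rank n by 9 - n, and the
Chess Challenger letters A through H are assumed to correspond to the files
QR through KR, as the earlier example (F2 = KB2) suggests.

```python
# Descriptive-notation file names in board order, matching Challenger files A..H.
FILES = ["QR", "QN", "QB", "Q", "K", "KB", "KN", "KR"]
LETTERS = "ABCDEFGH"


def flip_perspective(square):
    """Re-express a descriptive square as seen from the other side of the board."""
    file, rank = square[:-1], int(square[-1])
    return f"{file}{9 - rank}"


def challenger_to_descriptive(square, for_white=True):
    """Convert e.g. 'F2' to 'KB2' (or to Black's view when for_white is False)."""
    file = FILES[LETTERS.index(square[0].upper())]
    rank = int(square[1])
    return f"{file}{rank if for_white else 9 - rank}"


def descriptive_to_challenger(square, from_white=True):
    """Convert e.g. 'KB3' to 'F3'; Black's ranks are flipped first."""
    file, rank = square[:-1].upper(), int(square[-1])
    if not from_white:
        rank = 9 - rank
    return LETTERS[FILES.index(file)] + str(rank)
```

Applying `flip_perspective` to KR3 and KR4 gives KR6 and KR5, which is the
black to white translation quoted above.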


E. COMMUNICATIONS BETWEEN TERMINALS 
All communications between the umpire and player were handled by the
creation and use of a system "mailbox". This mailbox acted as a buffer
between the two processes. The two programs passed information to and from
the mailbox using the VAX 11/780 system's queued input/output routines.
The mailbox size was 600 bytes. Although this was quite large, the majority
of the time the only information passed through the mailbox was the actual
20-byte move entered by the player or umpire. The large size was necessary
to pass the initial board set-up, entered by the umpire, to the player
process before play could begin.

[Footnote, on the umpire-only timing noted in Section D: This did not
present a problem, even though it was the player's move that was being
timed, because the actual time span being measured was from the time the
umpire's move was sent to the player, to the time a legal move was received
back from the player.]

Each of the two processes was responsible for determining whether a move
was legal or not, and then carrying out the actions required by the decision.
Therefore, if the player entered an illegal move, the umpire terminal would
receive the move and determine its illegality, just as would the player
terminal. The system was designed in this manner to help the umpire control
the flow of the game, even though it was definitely redundant in nature.
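
The message sizes mentioned above can be illustrated with a small Python
sketch. It does not use or imitate the VAX system routines; a standard
`queue.Queue` merely stands in for the mailbox, and the record layout (ASCII
text padded with blanks to 20 bytes) is an assumption made for the example.

```python
import queue

MAILBOX_BYTES = 600   # mailbox size given in the text
MOVE_BYTES = 20       # size of a move record given in the text


def pack_move(text):
    """Encode a move as a fixed 20-byte record (blank padding is assumed)."""
    raw = text.encode("ascii")
    if len(raw) > MOVE_BYTES:
        raise ValueError("move text longer than the 20-byte record")
    return raw.ljust(MOVE_BYTES)


def unpack_move(record):
    """Recover the move text from a 20-byte record."""
    return record.decode("ascii").rstrip()


def send(mailbox, payload):
    """Place a message in the mailbox, refusing anything larger than the buffer."""
    if len(payload) > MAILBOX_BYTES:
        raise ValueError("message exceeds the mailbox size")
    mailbox.put(payload)


def receive(mailbox):
    """Take the oldest message out of the mailbox."""
    return mailbox.get()
```

Since both processes check legality independently, each side would run its
own legal move check on the text returned by `unpack_move` before updating
its board.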


F. VARIABLES 

Both local and global variables are used in the overwhelming majority
of the software modules. Because of the nature of the program, and its
need for a large number of variables passed between procedures, common
block variables were chosen over large parameter lists. Each common
block was tailored for a specific use so that the number of global
variables required in each subroutine could be kept to the minimum needed
to perform its necessary functions.


G. COMMENTS 

A final word is in order on the structure of the comments and other
documentation added to the program. At the beginning of each subroutine is
a description of each variable local to that specific module, and each
input or output parameter of that subroutine. At the beginning of each of
the two main processes are descriptions of all global variables common to
any or all the procedures of the process. Comments are interspersed
throughout all the software. We tried to comment blocks of code as much as
possible, rather than individual lines, to help in identifying program
structure and enhance the readability of the code.


IV. CONDUCT OF THE EXPERIMENT


At the earliest stages of formulating this experiment an agreement was
reached between ourselves and CDR Gary Porter, the instructor of the fall
class of OS-4602, C3 Systems Evaluation, to utilize the students in his
class as subjects for this experiment. In exchange for the use of his
class and classroom time, we would allow the experiment and its results to
be used in class as a learning tool for teaching experimental design.
Therefore, the time frame to conduct the experiment had to be convenient
for both CDR Porter's class objectives and this thesis' requirements and
goals. The time period agreed upon for execution of the experiment was a
two week period in early October 1983. The actual experiment took a week
longer than expected, lasting from 10-28 October 1983. The extra week was
needed due to the determination, as the experiment proceeded, that some
additional data points would be required for data analysis. Additionally,
a significant number of the subjects were unfortunately scheduled to be
absent during a large portion of the initial two weeks, and there was not
enough time in the remaining days to run these students through the
experiment before they left.

The entire experiment took place in the WAR Lab of the Naval Post-
graduate School (NPS), Monterey, California, located on the first floor in
Ingersoll Hall.


A. PLAYERS 
Overall there were 31 individuals who took part in the experiment. All
subjects were military officers, with rank ranging from Lieutenant
Colonel (O-5) to Lieutenant (O-3). All services were represented, with 14
officers from the Navy, 7 from the Army, 8 from the Air Force, and 2 from
the Marine Corps. Experience levels in playing chess will be discussed in
more detail in the analysis chapter. It will suffice here to say that the
experience level of the subjects as a whole was fairly low, with the most
experienced player being unranked in the US Chess Federation and
classifying himself as no more than an infrequent player. There were also a
few subjects who had never played the game before they attempted the
practice sessions which were scheduled a week before the actual experiment.

All but two of the individuals were part of either the Command, Control
and Communications (C3) or Space Systems Operations Curriculum. Strategic
decision making experience of the group was low, as would be expected with
officers of the above rank. Tactical decision-making experience, on the
other hand, was much more prevalent, with many subjects having extensive
ground or naval tactical warfare training and/or experience.


B. UMPIRE 

Along with the authors of this thesis, two other students of the C3
curriculum were used as umpires to control the experiment. The umpire's
job consisted of: preparing the two terminals and the Chess Challenger for
playing the correct scenario at the appropriate intelligence level, giving
the pre-experiment briefing to the subjects, providing the interface between
the experimental computer program and the Chess Challenger, informing the
subject of the time left to make a legal move, and controlling the data
collection aspects of the experiment.

[Footnote: These two umpires were also subjects, but acted as players
before learning the umpires' duties to ensure their data points were not
contaminated by the additional information given to them on the experiment.]


C. EQUIPMENT

The equipment utilized to conduct the experiment consisted of: the WAR
Lab's VAX 11/780 mini-computer, two Digital Equipment Corporation
VT-100/102 video display terminals and keyboards [Figure IV-1], and a Chess
Challenger computerized chess game. [Figure IV-2] One VT-100 terminal was
used by the player and the other by the umpire. During the experiment,
these terminals were controlled with the software program described in
Chapter III. The umpire operated the Chess Challenger.

1. About the Chess Challenger 

The Chess Challenger is a computerized chess game manufactured by
Fidelity Electronics, Ltd of Miami, Florida. This game supplied the primary
artificial intelligence tool used to calculate all of black's moves. It
was also used to provide the evaluation function utilized in computing the
player's relative board strength at specific times during play of the game.
Additionally, the built-in timer of the Chess Challenger was used to keep
track of the time left before the player was required to make the next
move.

[Footnote: This timing was for umpire and player information only. Timing
for penalty assessment was accomplished by the software program running on
the VAX 11/780. Software-controlled times, however, could not be displayed
at the terminals without seriously interfering with the game boards
displayed on the VT-100's.]

The Chess Challenger is an extremely powerful chess game capable of
playing at anywhere from novice to tournament level chess. It has been
ranked by the US Chess Federation at approximately 1825 - 1850. Although
the game is capable of playing at an extremely high level, the very lowest
level was chosen for this experiment. The average response time for the
game to make a move at this level was 5 seconds.

[Footnote: This rank equates to a Class A player. Rankings are as follows:
Grandmaster - 2600 and above, Senior Master - 2400 to 2599, Master - 2200
to 2399, Expert - 2000 to 2199, Class A - 1800 to 1999, Class B - 1600 to
1799, and Class C - Below 1600 points. [Ref. 8]]


D. PROCEDURES
1. Physical Lay-out
The equipment listed above was set up in an isolated corner of bay
3 of the WAR Lab during the execution of the experiment. The terminals
faced each other with a 6' X 6' partition separating the player and umpire
stations. [Figure IV-3] Partitions surrounded the player's working area to
completely isolate the player from outside interference in the lab. No
distractions such as clocks, other terminals, or printers were in view or
earshot of the subject.
2. Practice Session
While designing the experiment it was determined that there was a
solid need for some training and/or familiarization in playing chess before
the actual experiment could take place. Therefore, practice sessions on
the computer were scheduled the week before the experiment started. Each
subject was asked to log onto a terminal and play a chess game, similar to
the experiment, for at least one hour. The practice game board display used
the same type of symbology board as the experiment. The way a move was
entered was also identical. The practice chess game was played against the
computer, which used a chess program different from that used in the actual
trials.


3. Scheduling 

As described in the experimental design chapter, each subject was 
required to play the game four times. At the beginning of the experiment 
each individual was asked to sign up for three hours of time to play the 
four games. Depending on how fast the players made their moves, the four 
games would last anywhere from two to three hours. To try to avoid bore- 
dom and fatigue, the subjects were encouraged to sign up for three non- 
contiguous hours of play. 

4. Actions Before Each Game 

Before the start of each game a series of actions was required.

a. The umpire initially would reset the program and make the
selection as to how much intelligence the player would be allowed to have
during this game. A menu would appear on the umpire terminal listing six
options, corresponding to six different levels of intelligence to be
presented to white. [Figure IV-4] The umpire would choose the option
corresponding to the master experimental schedule, and would then select
an initial board set-up. Four game board set-ups were pre-programmed into
the experiment. The umpire additionally had the option of entering an
arbitrary set-up in case something had gone wrong and a game had to be
resumed at some place other than the initial set-up.

[Footnote, on the practice chess program described in subsection 2 above:
Although used for practice, this chess game was found to be entirely
unsuitable for determining moves in the actual experiment. No documentation
could be found on the game and no one had any idea where the game
originated. Also, it could not be determined if the game had adequate AI to
make intelligent and, more importantly, consistent moves. Another reason
this game was not used in the experiment was that it was found to have some
quite harmful end-game logic flaws which produced poor computer moves.]

b. Once the umpire had finished initializing the game, the player
would then be asked to enter his or her name and experience level. There
were four different experience levels the player could choose. [Figure IV-5]

c. A pre-game briefing was then conducted by the umpire. The
player's terminal would display the initial game board set-up, and explain
to the player how much intelligence would be provided during this game.
The umpire would ensure that the player fully understood what was being
displayed and also inform the player of the time allocated to make all
subsequent moves. Finally the player was told what pieces were already
captured and advised to make the first move when familiar with the pieces'
positions on the board.

5. Board Display of Intelligence Levels 

Different combinations of the two fundamental board representations
shown in Figures III-2 & III-3 were used to display the six choices of
intelligence which could be provided to the player. The intelligence
given to the player when playing the game at intelligence level six is
shown in Figure IV-6-f. This representation shows the greatest amount of
intelligence a player can receive. All other levels are made up of subsets
of the level six display.

The board shown in Figure IV-6-a is a representation of level 1,
the normal game board. It shows all of the white and black pieces. This
is the only board displayed under level 1. Intelligence level 2 [Figure
IV-6-b] displayed the same situation as level 1, but without any of the
black pieces displayed. Level 3 [Figure IV-6-c] included the same display
as shown in level 2, with the addition of a display which showed all of
white's safe moves. The level 4 display [Figure IV-6-d] showed all of
white's pieces and those black pieces that white was in a position to
attack in a single move. Figure IV-6-e illustrates level 5. That display
combines the attack board of level 4 with the safe board of level 3.

Figures IV-6-a through f are depictions of the situation presented
to the player at the beginning of a game under board set-up three. Since
intelligence levels 2 and 3 were not used for the actual experiment, there
were four possible displays of each of the four board set-ups, for a total
of 16 different views a subject might see when play began. [Figure IV-7]
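
The level descriptions above amount to a filter on which pieces are shown
and whether the safe board accompanies them. The following Python sketch is
one possible reading, not the experiment's display code. The names are
hypothetical, the position is held as a dictionary from square number to
piece code, and level 6 is assumed to show every piece together with the safe
board, since the text says only that it is the superset of the other levels.

```python
def visible_pieces(pieces, level, attackable):
    """Return the pieces shown to White at a given intelligence level.

    pieces:     dict mapping square number -> piece code (white > 0, black < 0)
    attackable: set of squares holding black pieces White can attack in one move
    """
    if level in (1, 6):                       # normal board: everything is shown
        return dict(pieces)
    if level not in (2, 3, 4, 5):
        raise ValueError(f"unknown intelligence level: {level}")
    shown = {sq: code for sq, code in pieces.items() if code > 0}
    if level in (4, 5):                       # add only the attackable black pieces
        shown.update({sq: pieces[sq] for sq in attackable if pieces.get(sq, 0) < 0})
    return shown


def shows_safe_board(level):
    """True for the levels that add the safe board (level 6 is an assumption)."""
    return level in (3, 5, 6)
```

With levels 2 and 3 excluded from the experiment, the remaining four levels
(1, 4, 5, and 6) combined with four set-ups give the 16 opening views
mentioned above.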

6. Executing a Game Turn 

A game turn consisted of one move each by the player and the umpire. 
The player's turn would begin when the player had received the last move by 
the umpire and the playing board(s) had been updated. The subject would 
then have a maximum of two minutes to review the information provided and 
make the next move. The only exception to this timing requirement was the 
first move. Before the first move the subject would have as much time as 
desired to study the initial board position and make the first move. 

Once the player had made a move, the umpire would receive that move
on the umpire terminal and enter the move into the Chess Challenger. The
Chess Challenger would then derive a move for black. If no additional
data collection was required during the turn, the umpire would enter Chess
Challenger's move into the computer and the boards would be updated for the
player's next move. If data collection was required for that move, the
umpire would be prompted by the terminal to enter an evaluation code which
described the subject's board strength at that particular time of the game.
This code would be obtained from the Chess Challenger and entered into the
computer. The board would then be updated and a new turn would begin.
7. Data Collection 

The evaluation code was automatically collected and recorded at 
game turns eight, nine, and ten. At game turn 10 the entire board was 
recorded. Additionally, by the use of the SAVE DATA function built into
the software, data could be captured at any point during the game, at the
umpire's request.


E. ERROR CORRECTION AND RECOVERY 

There was no "take-back" or "whoops" command built into the software to
enable a player to retract a legal move that was already entered. As
in the real game of chess, once a move was entered it could not be changed.
There were, however, ways to correct errors in entries if necessary. The
procedure used most commonly when an error required correcting or the
computer went down was to reset the board and enter, as the initial board
set-up, the position that existed before the incorrect move.


V. EVALUATION OF DATA


A. COLLECTING THE DATA

As mentioned earlier, our software automatically created a data file
for each move. On the first move the subject's name and experience level,
the intelligence level and initial board setup being played against, and a
representation of the board at that instant were recorded. On all moves,
the moves of White and Black in standard chess alphanumeric format were
saved along with the elapsed time from when White got the W. prompt until
a legal move had been correctly entered. At the eighth, ninth, and tenth
moves the software also queried the umpire for an evaluation score which
was obtained from the Chess Challenger. The evaluation took the form of
a six character alphanumeric representation unique to the Chess
Challenger and a "B" or "W" to indicate advantage to Black or White. At the
tenth move the software also recorded the subject's name and experience
level, the intelligence level and initial board setup being played against,
and a representation of the board at that instant.

The conversion of the evaluation code captured at moves eight, nine,
and ten to our numerical measure of effectiveness (MOE) was a four step
process.

Step 1. Each of the four initial board setups was put into the Chess
Challenger to obtain a baseline evaluation for that setup. Using a table
in the Operator's Manual for the Chess Challenger, the six character
evaluation was decoded into a numerical score. A score showing advantage to
White was recorded as positive; advantage to Black was negative.

Step 2. The data files were printed out and the six character
evaluation code at the tenth move was similarly decoded into a raw score.
Because all of our subjects were losing at move ten, all the raw scores
were negative.

Step 3. The penalty for excessive time for the subject to enter moves
was calculated. This was done by observing on the data file the elapsed time
for White's moves on the second through tenth moves. Any time in excess of
120 seconds per move was summed to obtain a total penalty time in seconds.
For each minute, or fraction thereof, of penalty time the subject lost a
number of points equal to the value of one pawn (256 points). For example,
a total penalty time of 75 seconds, or 1.25 minutes, results in a penalty of
2 x 256 = 512 points.

Step 4. The raw score obtained in Step 2 was adjusted for time penalties 
and initial setup advantage by subtracting the results of Steps 1 and 3 to 
arrive at the MOE. Note that an initial setup advantage to Black, a negative 
number, causes the MOE to be more positive because a negative number is sub- 
tracted. This is as it should be because it rewards White for overcoming an 
initial disadvantage. 

Three occasions arose where the subject was checkmated before move ten. 
An arbitrarily large negative score of -99999 was assigned in those cases 
and used as the MOE.
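
The four steps can be condensed into a short Python sketch. It is an
illustration under stated assumptions, not the procedure actually used, which
was carried out by hand from printed data files. The function names are
hypothetical, and the penalty time is taken to be the sum of the seconds by
which each move exceeded 120, which is the reading consistent with the
75-second example in Step 3.

```python
import math

PAWN_VALUE = 256            # points lost per penalty minute (from Step 3)
MOVE_LIMIT_SECONDS = 120    # time allowed per move
CHECKMATE_MOE = -99999      # score assigned when checkmated before move ten


def penalty_points(move_times):
    """Time penalty for the elapsed times (in seconds) of moves two through ten."""
    over = sum(max(0, t - MOVE_LIMIT_SECONDS) for t in move_times)
    return math.ceil(over / 60) * PAWN_VALUE      # each started minute costs a pawn


def moe(raw_score, baseline, move_times, checkmated=False):
    """Measure of effectiveness: raw score less the setup baseline and time penalty."""
    if checkmated:
        return CHECKMATE_MOE
    return raw_score - baseline - penalty_points(move_times)
```

For instance, `penalty_points([100, 195, 90])` counts 75 seconds of overrun
and returns 512, and a baseline of -200 (an initial advantage to Black)
raises the MOE by 200 points, as Step 4 describes.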

A new data file containing six columns was then built and is included
as Table V-1. The first column is a subject identification number that
matched each subject's name. Since the specific performance of any
particular individual was not an issue, the corresponding names were not
provided. Column 2 is the representation of each subject's chess playing
experience: "1" for a complete novice, "2" for the subject who was
familiar with chess but not a regular or frequent player, "3" for a frequent
player, and "4" for a tournament player. The data in column 3 is the
intelligence level presented to the subject on that trial. The data in
column 4 gives the initial board setup for each trial. Column 5 contains
the MOE. The value in column 6 indicates the sequence in which this trial
occurred, i.e., whether it was this subject's first, second, third, or
fourth trial.


B. ANALYSIS 
1. Initial Quick Look 
The first look at the MOE data showed huge variances that resulted 
from six outliers in the one hundred twenty-four trials. Of those six, three 
were cases in which the subject was checkmated. The other three were 
instances where checkmate was imminent. The largest of these scores was
-30556; the smallest of the remaining scores was -7151. The six exaggerated
scores were made by five different subjects, against three of the four
intelligence levels and three of the four initial setups, and occurred on 
the second through fourth trials. In other words, they appear to be 
randomly dispersed. 
2. Handling The Dilemma 
Proper treatment of these outliers was necessary to proceed further
with any statistical analysis. After investigating several potential paths
we decided to recode the six exaggerated scores to a value lower than the
lowest in the main body of data points but not so disastrously low as
that initially coded. We selected, arbitrarily, the value -9000.
We feel this was a reasonable approach because the MOE for a
checkmate was arbitrarily set and the value we picked was sufficiently
large in magnitude to set it apart from those MOE's arrived at otherwise.
This method allowed continued analysis without reducing the size of our data
base while preserving the significantly more disastrous results on those six
trials.

All further analysis and conclusions refer to this "adjusted
data."
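
The recoding step is simple enough to state as code. The Python sketch below
is illustrative only: the function name is hypothetical, and the cutoff of
-30556 is an assumption that merely reproduces the six scores identified
above by inspection.

```python
def recode_outliers(scores, cutoff=-30556, recoded=-9000):
    """Replace every MOE at or below the cutoff with the recoded value."""
    return [recoded if s <= cutoff else s for s in scores]
```

All other scores pass through unchanged, so the number of trials in the data
base is preserved.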

3. Determine Which Parameters Were Significant

The next step was to determine which of the factors in the 
mathematical model were statistically significant. To do this we used a 
general linear model procedure known as the "Extra Sum of Squares" method 
[Ref. 9]. 

The basic idea of this method is to do an analysis of variance
(ANOVA) using the entire model, then repeat the ANOVA on a reduced model
that omits the parameters corresponding to the factor or factors under
investigation. The difference in the model sum of squares for the two runs
is due to the influence of the omitted factor or factors. Using the two sums
of squares an F statistic is then calculated and used to indicate the
significance of the factor or factors under consideration. The equation is:


$$F = \frac{[RSS(f) - RSS(m)] / [DF(f) - DF(m)]}{ESS(f) / [DF(t) - DF(f)]}$$

where:
RSS(f) = Regression Sum of Squares, full model
RSS(m) = Regression Sum of Squares, modified model
DF(f) = Degrees of Freedom of regression, full model
DF(m) = Degrees of Freedom of regression, modified model
DF(t) = Total Degrees of Freedom, same for either model
ESS(f) = Error Sum of Squares, full model

The specific values and results of the computations appear in
Table V-2. At a 95% confidence level, the intelligence level and the
initial board setup were both statistically significant. The low calculated
F-statistic for trial number indicates that the experiment design
successfully precluded "learning" from affecting the results. Also as
expected, the subject's experience as a chess player was significant.
Inspection shows that those with the most experience scored highest.
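
The Extra Sum of Squares computation can be sketched in a few lines. This is an illustration under stated assumptions, not the program used for the thesis: the function name `extra_ss_f` is hypothetical, the sums of squares are the entries of Table V-2 as best they can be read, the reduced-model degrees of freedom of 7 for experience is a reading of a partly illegible entry, and the figure 115 that Table V-2 lists as DF(t) is treated here as the error degrees of freedom.

```python
def extra_ss_f(rss_full, rss_reduced, df_full, df_reduced, ess_full, df_error):
    """F statistic for the factor(s) omitted from the reduced model."""
    # Mean square attributable to the omitted factor(s).
    numerator = (rss_full - rss_reduced) / (df_full - df_reduced)
    # Error mean square of the full model.
    denominator = ess_full / df_error
    return numerator / denominator


# Full-model values as read from Table V-2.
RSS_FULL = 114669799
ESS_FULL = 387368583
DF_FULL = 8
DF_ERROR = 115  # assumption: Table V-2 labels this figure DF(t)

# (RSS of reduced model, regression DF of reduced model) per omitted factor.
reduced_models = {
    "INTELLIGENCE": (86943082, 5),
    "SETUP": (48858404, 5),
    "EXPERIENCE": (97773847, 7),  # DF of 7 is an assumed reading
}

f_stats = {
    name: extra_ss_f(RSS_FULL, rss_m, DF_FULL, df_m, ESS_FULL, DF_ERROR)
    for name, (rss_m, df_m) in reduced_models.items()
}

for name, f_value in f_stats.items():
    print(f"{name:12s} F = {f_value:.2f}")
```

Each calculated F would then be compared with the tabulated F for the corresponding numerator and denominator degrees of freedom, as described in the notes to Table V-2.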

The intent in the experimental design had been to make the initial 
board setup insignificant. Since the results showed this was not the 
case, board setup was investigated further as was the effect of intelli- 
gence level provided. 

4. Critical Factors in the Significant Parameters

To determine which levels were significantly different in the 
factors intelligence level and board setup we used the Scheffe multiple 
comparison analysis of variance procedure. 

The basic idea of "Scheffe's Test" is to compare the means of the
samples of concern two at a time in all combinations of two and arrive at
simultaneous 95% confidence levels for the differences of any pair of
levels of a factor. As an example, a Scheffe's multiple comparison of
three sample populations would say that with 95% confidence, all the
where

F_{.05} = critical value of F (with r-1 and r(n-1) degrees of freedom)
leaving 5% in the upper tail

s_p = pooled standard deviation, $s_p^2 = \frac{1}{r} \sum_{i=1}^{r} s_i^2$

r = number of means to be compared

n = sample size.


Wonnacott and Wonnacott [Ref. 10] provide a good illustration.

The results of Scheffe's Test for intelligence level showed that
levels 1 and 3 were significantly different from each other but neither
varied significantly from levels 5 and 6. When applied to the initial
board setup, scores against setup 3 were significantly worse than against
setups 1, 2, or 4. Table V-3 provides the data leading to these conclusions.
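
The pairwise reading of Table V-3 can be sketched as below. This is a minimal illustration, not the procedure used to produce the table: the function name `significant_pairs` is hypothetical, the minimum significant difference is taken directly from Table V-3 rather than recomputed, and only the three setup means that are legible in Table V-3 are used.

```python
from itertools import combinations

MIN_SIGNIFICANT_DIFFERENCE = 1335.16  # from Table V-3

# Mean adjusted score by initial board setup (legible entries of Table V-3).
setup_means = {4: -2335.9, 2: -2589.6, 3: -4070.2}


def significant_pairs(means, msd):
    """Return the pairs of levels whose means differ by more than `msd`."""
    pairs = []
    for a, b in combinations(sorted(means), 2):
        if abs(means[a] - means[b]) > msd:
            pairs.append((a, b))
    return pairs


print(significant_pairs(setup_means, MIN_SIGNIFICANT_DIFFERENCE))
```

Pairs that do not appear in the result fall within the same simultaneous confidence interval, which is what a shared grouping letter in Table V-3 indicates.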


C. WHAT WENT WRONG WITH SETUP-3? 

We knew from our evaluation of the initial board setups that, as far
as the Chess Challenger was concerned, setup 3 was the second most
advantageous for White, so the answer was not in the numerical realm. We had
been the umpires for approximately 95% of the trials and began to think
about what we had observed while the subjects faced that setup. In
comparing notes we found that we both had observed many instances of our
subjects falling into an unintentional trap in the first few moves. From
the initial positions shown in Figure II-3, White almost always made the
apparently optimal move of queen takes rook at QB3, to which Black
responded by knight takes pawn at White's K4. Probably due to their lack
of chess skills and the unfamiliarity of our symbolic board representation,
the vast majority of our subjects failed to recognize that Black's knight
now forked their queen and the bishop at KN5. Many saw only the threat to
the bishop, or the threat to their knight at KB3 from Black's bishop, and
moved accordingly. Black then captured White's queen and the victim
never recovered from the sudden early loss. On several occasions it
seemed the psychological impact of the queen's loss at this stage was so
staggering to the subject that it was worse than the material loss.
Perhaps a similar thing happened to the numerically superior French army in
1940 when the German army swept around the Maginot Line. After the
Germans rendered useless what had been the centerpiece of the French
defense, the French army was quickly defeated.


D. WHAT IF SETUP-3 IS OMITTED? 

The Extra Sum of Squares and Scheffe's Test procedures were repeated
on a modified data file that omitted all trials against initial setup 3.
Again, intelligence levels 1 and 3 were different from each other.
Neither was statistically different from levels 5 or 6. Initial board
setup was not significant. Our interpretation of these results is
discussed with the rest of our conclusions in Chapter VI.
VI. CONCLUSIONS AND RECOMMENDATIONS


A. CONCLUSIONS 
1. An Optimum Amount of Information Exists 

Performance on the simulated battlefield tends to improve as the
overall amount of information about one's opponent increases up to some
optimum level. We observed that as our subjects were given more
information about Black's strength and position their scores improved until,
at some point, the additional information was too much to be effectively
utilized in the time allowed. Scores against Intelligence Level 3 (the
least information) were worse than when the next higher amount of
information was presented in Intelligence Level 5. Scores against
Intelligence Level 5 were similarly not as high as against the next higher
amount of information provided in Intelligence Level 1 (the normal view of
the board). Additional information beyond that point served only to
confuse the situation, and performance degraded: scores against
Intelligence Level 6, which displayed the most information, were lower than
against Intelligence Level 1. Of these differences, only that between
Levels 1 and 3 was statistically significant under Scheffe's Test
(Table V-3). The possible reasons for this pattern are multiple.

On the one hand, the additional information may be simply too much
to be assimilated in the time allotted. A direct analogy can
be drawn to a military command center into which messages flow at a faster
rate than they can be digested or acted upon. They pile up all over the
command center, perhaps obscuring other information. Important data gets
lost in the general deluge because it cannot be spotted and separated
from the chaff.

Another possibility is that the total amount of information is not
necessarily excessive but that it is excessive in the format in which it is
presented. To illustrate this point consider the information contained in
this paragraph. It can be easily read and understood in a few moments.
However, if the same amount of information (i.e., this paragraph) were given
to the reader as a block of dots and dashes along with a copy of the
International Morse Code, the average reader would have significant
difficulty understanding it. It is important to note that our experiment was
not about the method of presentation but the quantity of information
presented. Within that context, our results still hold. There will be some
optimum amount of information that can be utilized by a particular subject
for each separate method of information presentation. Beyond that point too
much time is spent in deciphering the presentation to allow adequate time
for digesting it and formulating a plan of action.

In the limit, of course, there will exist some quantity of data that
is excessive regardless of the method of presentation. We have the physical
ability to pass that saturation threshold now by stacking teletype machines
and communications systems in our command centers. We also have a tendency
to overkill at every level. No Captain wants to tell the Admiral, "I don't
know," when asked a question, so the Captain ensures the information is
there to cover any area about which the Captain thinks the Admiral might
ask. Likewise, the Lieutenants to whom the Captains turn with their
questions try to ensure they will always have the answers available. And so
the quantity of information we may think we desire continues to mushroom.

A very real and continuing problem of modern warfare is how to adequately
balance the capability to provide information, the desirability of having
given data, and the optimum display of the information that is desired. The
interaction between the method of display and the amount of information
becomes increasingly important as the amount of information desired
becomes larger.
2. Experience and Training Help 

We observed that subjects with more chess playing experience
tended to score higher than subjects with less experience. A direct
analogy can be drawn to the battlefield. To exaggerate the obvious, one
would not expect a new second lieutenant to fare as well directing an
army as an infantry lieutenant general with thirty years' experience.
Likewise, one would expect a vice admiral of similar experience to fare
better in command of a carrier battle group than the lieutenant general
would.

3. Psychological Impact Can be a Major Factor 

As mentioned earlier, on several occasions while playing against
setup 3 our subjects lost their queen in the first few moves. This was
a significant material loss in each instance. But in some cases the
psychological impact seemed even more devastating. Some subjects were
visibly upset for several moves afterward and never regained control of
the situation. They were thereafter unable to mount a coordinated
attack. A direct analogy can be drawn to the effect of a serious loss
early in an engagement. As a hypothetical situation, consider a
carrier battle group preparing for an approaching air raid that should
be easily repulsed. Just as the incoming bombers are detected and the
anti-aircraft plan starts to unfold, the carrier suffers an internal
explosion from dropped ordnance and is put out of action. Despite the
loss of the carrier, the remaining fighter aircraft and surface
combatants should be able to repulse the raid. The remaining escorts could
then nurse the carrier back to a safe area and regroup for further
action. However, it is conceivable that the early critical damage to
the carrier could cause significant disorientation of the defense,
manifested in screen disintegration and in wasted time and effort to find
and combat a nonexistent submarine threat (this would not have to be
prolonged but simply a distraction from the task at hand). The result
could be significantly greater effectiveness of the air raid.


B. CONCLUSIONS THAT CAN NOT BE REACHED 

Why was our optimal amount of information optimal? Was it because
the absolute amount of information given to the subjects and the method
of display were in proper balance or was it because that presentation
most nearly resembled the normal view of a chess board with which our
subjects were all somewhat familiar? We cannot answer that question
from our experiment. We suspect that familiarity was a factor in making
that particular display optimal. However, it can be argued that the
amount of information was still the major factor because the same display
was included as a portion of the information level 6 display against
which our subjects scored more poorly.


C. RECOMMENDATIONS FOR FUTURE WORK


1. Expand the Sample Size 
With a larger sample one would expect the results to become
more clear cut. Perhaps the adjustments we had to perform on the outlying
scores could be done away with and those points omitted. With the small
original sample size, omission of those scores produced inconclusive
results. The new samples could also be analyzed as a separate group
and those results compared to the original.
2. Compare Methods of Information Display 
The experiment could be run using only one level of information 
but displaying it in a variety of ways. The display methods could 
include: 
a) the same as in this experiment 
b) the same as in this experiment but allow the subject to use 
a standard chess board and pieces for manual manipulation as 
a decision or visualization aid 
c) use the RISNEY/TSCHUDY project from OS-4602, C3 Systems
Evaluation (Fall Qtr 1983), to display the chess board and 
pieces as iconic symbols on the RAMTEK color monitors in the 
C3 laboratory. This software produces an easily manipulated 
computer generated color graphic representation of the board 
with standard shapes for all the pieces. 
3. Start From the Opening Move of the Game 
This experiment started the subject at mid-game with the explicit
intention of denying the subject any prior intelligence as to the exact
strength and disposition of the opponent. The experiment could be run
with the game always beginning at the first move and proceeding for some
longer number of moves well into middle-game. The number of moves to
play would have to be determined by trial runs to re-establish a good
sample point. A potentially confounding element that must be investigated
is the impact of chess playing experience. The better players would be
expected to play a better opening game. This could drastically affect
the number of moves before checkmate and therefore the appropriate sample
point. A large enough sample set of experienced players with nearly
equal ratings may be able to avoid the problem.


4. Test the Relation Between Experience and the Amount or Method of 
Display of Information 


Based on subjective observation by the umpires, when playing 
against information level 6 the more experienced players relied less on 
the safe position and possible attack portions of the display while in- 
experienced players used them heavily. That hypothesis could be tested 
but the difficulty would lie in how to measure utilization of the various 
portions of data displayed. With the equipment currently available at 
the Naval Postgraduate School, that could only be done very subjectively 
with questionnaires for the subjects. Though not available here, there 
exist in commercial use devices for accurately measuring how the human 
eye scans an area. These could be used to quantitatively examine the 
percentage of time a subject actually looked at any given sector of the 
display. 

Other related experiments are certainly possible. Our experiment 
was never contemplated as exhaustive, but more as a beginning. The 
field of information management is becoming ever more complicated and 
ever more important. Therefore, the potential for experiments such as 
this to serve a useful purpose increases. Perhaps the same idea could
be used in specific applications to improve the utility of information
displays in command centers if adequate control of experimental factors
could be established.

Figure II-2 (legend entries legible in this copy: Queen, King)

TABLE V-1


EXPERIMENT DATA FILE


SUBJECT   EXP   INT   SET
   ID     LVL   LVL   UP     SCORE   TRIAL
    1      ?     ?     ?     -3237     1
    2      ?     ?     ?     -1155     1
    3      ?     ?     ?     -1477     1
    4      ?     ?     ?     -1834     1
    5      ?     ?     ?      -302     1
    6      ?     ?     ?      -319     1
    7      ?     ?     ?     -2181     1
    8      ?     ?     ?     -2317     1
    9      ?     ?     ?     -2041     1
   10      ?     ?     ?     -5722     1
   11      ?     ?     ?     -1901     1
   12      ?     ?     ?     -2915     1
   13      ?     ?     ?     -5558     1
   14      ?     ?     ?     -5968     1
   15      ?     ?     ?     -2896     1
   16      ?     ?     ?     -6812     1
   17      ?     ?     ?     -3351     1
   18      ?     ?     ?     -6207     1
   19      ?     ?     ?     -2599     1
   20      ?     ?     ?     -2308     1
   21      ?     ?     ?     -1481     1
   22      ?     ?     ?     -3058     1
   23      ?     ?     ?     -1000     1
   24      ?     ?     ?     -2770     1
   25      ?     ?     ?      -682     1
   26      ?     ?     ?     -2319     1
   27      ?     ?     ?     -1907     1
   28      ?     ?     ?     -2907     1
   29      ?     ?     ?     -1408     1
   30      ?     ?     ?     -2643     1
   31      ?     ?     ?     -4320     1
    1      ?     ?     ?     -3963     2
    2      ?     ?     ?     -3480     2
    3      ?     ?     ?     -3337     2
    4      ?     ?     ?     -2711     2
    5      ?     ?     ?     -2473     2
    6      ?     ?     ?     -3026     2
    7      ?     ?     ?     -1263     2
    8      ?     ?     ?     -1618     2
    9      ?     ?     ?     -1545     2
   10      ?     ?     ?     -3731     2

TABLE V-1 [continued]


SUBJECT   EXP   INT   SET
   ID     LVL   LVL   UP     SCORE   TRIAL
   11      2     ?     4     -1309     2
   12      2     5     4         ?     2
   13      2     ?     1     -2573     2
   14      2     3     ?         ?     2
   15      2     ?     2     -1494     2
   16      2     5     2      -593     2
   17      2     6     4      -936     2
   18      1     6     4     -2426     2
   19      2     6     1     -3340     2
   20      3     3     1     -1731     2
   21      2     6     2     -1717     2
   22      2     6     2     -1224     2
   23      2     6     3     -4740     2
   24      2     ?     3     -3041     2
   25      2     5     2     -2360     2
   26      2     1     2     -4076     2
   27      1     1     3     -2861     2
   28      2     1     3      -859     2
   29      2     ?     4         ?     2
   30      3     1     4     -1033     2
   31      2     1     1     -1887     2
    1      2     3     3     -6247     3
    2      1     3     4    -33135     3
    3      2     5     2      -335     3
    4      2     3     4     -1525     3
    5      2     6     2     -3091     3
    6      2     6     3         ?     3
    7      2     6     3     -1843     3
    8      2     3     4     -3093     3
    9      3     ?     1      -334     3
   10      2     1     4      -888     3
   11      2     1     1     -2698     3
   12      2     1     3     -2382     3
   13      2     6     2     -1472     3
   14      2     6     4     -1000     3
   15      2     6     1      -489     3
   16      2     3     4     -1904     3
   17      2     5     1     -1947     3
   18      1     1     2     -2995     3
   19      2     1     2     -1654     3
   20      3     1     3     -3844     3
   21      2     5     1     -7151     3


TABLE V-1 [continued]


SUBJECT   EXP   INT   SET
   ID     LVL   LVL   UP     SCORE   TRIAL
   22      2     3     ?         ?     3
   23      2     5     1     -1494     3
   24      2     ?     2     -1244     3
   25      2     5     2         ?     3
   26      2     3     4     -2486     3
   27      1     ?     2     -2508     3
   28      2     5     4     -2776     3
   29      2     6     2     -2429     3
   30      3     6     3     -4494     3
   31      2     6     3     -3754     3
    1      2     6     4         ?     4
    2      1     6     3         ?     4
    3      2     6     4     -2395     4
    4      2     3     2     -3808     4
    5      2     3     3     -5058     4
    6      2     3     2     -5463     4
    7      2     5     4    -99999     4
    8      2     5     ?         ?     4
    9      3     6     4     -1014     4
   10      2     6     1     -3339     4
   11      2     6     3      -947     4
   12      2     ?     1     -2897     4
   13      2     3     4     -1006     4
   14      2     1     2     -1621     4
   15      2     1     4      -989     4
   16      2     1     1     -2306     4
   17      2     3     2     -3052     4
   18      1     ?     1      -652     4
   19      2     ?     3     -5307     4
   20      3     ?     2     -1356     4
   21      2     ?     ?      -881     4
   22      2     1     1     -1731     4
   23      2     1     2     -1694     4
   24      2     1     1     -1413     4
   25      2     6     4     -1201     4
   26      2     6     3     -3938     4
   27      1     6     4     -1660     4
   28      2     3     2     -5882     4
   29      2     3     2     -2717     4
   30      3     3     2     -1371     4
   31      2     5     4     -1586     4

TABLE V-2


EXTRA SUM OF SQUARES DATA


RSS(f) = 114669799

ESS(f) = 387368583

DF(f) = 8

DF(t) = 115

Test for
effect of:       RSS(m)       DF(m)   F(calc.)   Significance Level
INTELLIGENCE     86943082       5        ?          0.0466
SETUP            48858404       5        ?          0.0004
EXPERIENCE       97773847       7        ?          0.0270
TRIAL NO.               ?       7        ?               ?


Note 1: Tabulated F-statistics are for the 95% confidence level.
Note 2: Those factors for which F(calculated) is greater than
F(tabulated) are significant at the 95% confidence level.

TABLE V-3


SCHEFFE'S TEST RESULTS


Confidence Level = 95%


BY INTELLIGENCE LEVEL:


Grouping    Mean        INTELLIGENCE LEVEL
A           ?           1 (normal board view)
B A         -262?.2     6 (most information)
B A         -2722.7     5
B           -3588.9     3 (least information)
Minimum significant difference = 1335.16


BY INITIAL BOARD SETUP:


Grouping    Mean        Setup
A           ?           1
A           -2335.9     4
A           -2589.6     2
B           -4070.2     3
Minimum significant difference = 1335.16


Note: The letters A and B in the Grouping columns have no
special meaning. They serve only to illustrate which sample
means are within simultaneous 95% confidence intervals of
each other.